On the Evidence for Axion-like Particles from Active Galactic Nuclei

Burrage, Davis, and Shaw [1] recently suggested exploiting the correlations between high and low
energy luminosities of astrophysical objects to probe possible mixing between photons and
axion-like particles (ALPs) in magnetic field regions. They also presented evidence for the existence of
ALPs by analyzing the optical/UV and X-ray monochromatic luminosities of AGNs. We extend
their work by using the monochromatic luminosities of 320 unobscured Active Galactic Nuclei from
the Sloan Digital Sky Survey/XMM-Newton Quasar Survey [2], which allows the exploration of 18
different combinations of optical/UV and X-ray monochromatic luminosities. However, we do not 
find compelling evidence for the existence of ALPs. Moreover, it appears that the signal reported 
by Burrage et al. is more likely due to X-ray absorption rather than to photon-ALP oscillation. 



I. INTRODUCTION 

Very light scalar fields could have had a significant 
impact on cosmology, potentially acting as dark matter 
(e.g., axions or axion-like particles) or explaining the recent
accelerated expansion (e.g., a quintessence field). In
both cases, their roles are primarily gravitational, either 
to provide additional gravitational clustering on galac- 
tic or cluster scales, or to drive the overall expansion 
of the Universe. However, this does not rule out non- 
gravitational interactions, which should also exist if not 
explicitly forbidden by some symmetry [3]. 

The strengths of such interactions are well constrained 
for axion-like particles (ALPs). These interactions could 
lead to more efficient stellar cooling, and the limits from 
solar axions observed on Earth constrain the couplings 
to be $g < 10^{-10}\,\mathrm{GeV}^{-1}$ [4]. Additionally, ALPs can
be emitted from the core of supernovae at a significant
rate [5]; the lack of evidence for such an outburst from SN
1987A yields $g < 10^{-11}\,\mathrm{GeV}^{-1}$ for a very light ALP
($m_\phi < 10^{-9}$ eV) [4]. Similarly in the quintessence case,
such interactions between photons with a slowly rolling 
field would lead to time variations in the fine structure 
constant which could be observed in stellar lines [3]. 

One way of avoiding these constraints is the so-called 
chameleon model [6], where non-minimal couplings to 
gravity lead to the effective mass or coupling of the scalar 
field being dependent on the local mass density. In this 
way, many of the constraints on the mass of the scalar 
fields can be satisfied, while still allowing for a significant 
interaction strength in regions where the Universe is less 
dense [7, 8]. 

It is worth trying to constrain couplings of the axion- 
like particles in low density regions where they are not 
masked by chameleon effects. Typical interactions couple 
the axion-like particle to two photons; this can lead to 
photons decaying into axions as they pass through magnetic
fields. If the probability is small, this leads to 'tired
light' scenarios, where objects at large distances are progressively
dimmer than expected. These ideas have been
investigated in the context of Type Ia supernovae, where
the effect is similar to that of grey dust [9-12] but still
cannot explain their apparent dimming without introducing
cosmic acceleration (see Ref. [13] and references
therein).

* Electronic address: Guido.Pettinari@port.ac.uk

In the opposite strong-mixing regime, the photons 
could convert to axions rarely, but with high probabil- 
ity, as they pass through the magnetic fields around a 
galaxy or cluster. On average, the mixing will result in 
one third of the photons being converted into ALPs, but
the exact amount will depend on the magnetic field orien- 
tations along the individual photon paths. Given the rel- 
atively short coherence lengths associated with such mag- 
netic fields, the mixing is expected to vary from source 
to source. 

The average suppression of such strong mixing is difficult
to detect without well calibrated sources at cosmological
distances. However, Burrage, Davis, and Shaw [1]
(hereafter, BDS) recently proposed using the distribution
of the fluxes of cosmological sources as a means of constraining
the mixing. Since the mixing is dependent on
the photon energy, the low energy fluxes are used to calibrate
the brightness of the sources. After considering a
number of potential sources, BDS analysed the distribution
of X-ray fluxes of AGN, normalised by their optical
fluxes, and found significant evidence of such mixing, up
to the 5σ level in their most recent analysis [14].

Such strong evidence for ALP mixing is tantalizing, 
and the purpose of this paper is to re-examine and extend 
their analysis with larger data sets to see how robust the 
signal is. In section II, we review the dynamics of the 
mixing model and the expected signal. In section III,
we examine a number of possible ways to evaluate the
statistical significance of such a signal, and in section IV
we apply these methods, re-examining the original claim
of BDS and then extending the analysis to the larger data
set of Young et al. [2], before concluding in section V.



II. MODEL ASSUMPTIONS 

A. Scatter from mixing to Axion-like particles 

For our purposes, an axion-like particle (ALP) is a 
light but massive scalar or pseudo-scalar field that cou- 
ples with the kinetic term of the Lagrangian of the pho- 
tons. Depending on whether the ALP is a scalar or a 
pseudoscalar, this interaction term has one of the following
forms:
where $\phi$ is the axion field and $M = 1/g$ sets the scale of
the strength of the photon-ALP coupling. As a result,
when a photon of energy $E$ travels through a magnetic
domain of length $L$ and intensity $B$, there is a non-zero
probability that it oscillates into an ALP [15-17]:
where $E_{\rm crit} = m_\phi^2/2\mu$ is the characteristic energy scale,
$\mu = B/M$, and $m_\phi$ is the ALP mass.

We are most interested in photon-ALP mixing in an
astrophysical context where the propagating medium is
an electron plasma. Therefore, we have to substitute
an effective mass for the mass of the ALP, $m_\phi \to m_{\rm eff}$,
where,
here, $\omega_{\rm pl}^2 = 4\pi\alpha_{\rm em} n_e/m_e$ is the plasma frequency, $\alpha_{\rm em}$ is
the fine structure constant, $n_e$ is the free electron number
density and $m_e$ is the mass of the electron. The
parameter $\epsilon$ can be either 1 (scalar ALP) or 0 (pseudo-scalar
ALP); in the following discussion we shall always
assume $\epsilon = 0$, i.e. a pseudo-scalar axion-like particle.
In the typical environments we are interested in, the free
electron densities are of order $10^{-3} - 10^{-2}\,\mathrm{cm}^{-3}$, leading
to plasma frequencies of order $10^{-12}$ eV. For very
light masses ($m_\phi < 10^{-12}$ eV), the effective mass will be
dominated by this plasma frequency.
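
As a rough numerical check of the plasma frequency scale quoted above, the following sketch evaluates $\omega_{\rm pl} = \sqrt{4\pi\alpha_{\rm em} n_e/m_e}$ in natural units. The densities passed in are illustrative values chosen here, and the function name is hypothetical.

```python
import math

ALPHA_EM = 1 / 137.035999        # fine structure constant
M_E_EV = 510998.95               # electron mass in eV
HBAR_C_EV_CM = 1.973269804e-5    # hbar*c in eV cm

def plasma_frequency_ev(n_e_per_cm3):
    """Plasma frequency in eV for a free electron density given in cm^-3."""
    # Convert the number density to natural units (eV^3).
    n_e_ev3 = n_e_per_cm3 * HBAR_C_EV_CM ** 3
    return math.sqrt(4 * math.pi * ALPHA_EM * n_e_ev3 / M_E_EV)

# Illustrative intra-cluster densities (assumed values, not taken from a catalog).
for n_e in (1e-3, 1e-2):
    print(f"n_e = {n_e:g} cm^-3 -> omega_pl = {plasma_frequency_ev(n_e):.2e} eV")
```

For densities of this size the result is of order $10^{-12}$ eV, so an ALP much lighter than this has its effective mass set by the plasma.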

Here, we focus on mixing which would occur as photons
traverse a typical intra-cluster medium, where there
are magnetic domains of coherence length $L \approx 1 - 100$ kpc
and intensity $B \approx 1 - 10\,\mu$G [18]. Passing through a
whole cluster of length 1 Mpc, the photons will encounter
a number ($N \gg 1$) of independent magnetic domains.
Initially we shall assume that every light path either
crosses a cluster while travelling towards us or originates
inside a cluster. This assumption is optimistic
and we shall discuss what happens when it is relaxed in
Sec. III.



We are interested in the limit where the mixing is strong
and independent of the photon energy. Strong
mixing occurs in a single domain when $BL/2M > 1$;
that is, if the magnetic fields are sufficiently strong, are
coherent over a large enough region or the coupling to
the ALP is high enough. For typical cluster magnetic
fields, strong mixing in a single coherent region requires
$M < 10^{11}$ GeV. However, even if the mixing probability
is small in a given region, strong mixing over the whole
path still occurs as long as $N P_{\gamma\to\phi} = N (BL/2M)^2 > 1$.
If $BL/2M > 1$, energy independence holds if $E \gg E_{\rm crit}$;
for weaker mixing, the energy independence extends to
lower energies ($E > E_{\rm crit}(BL/2M)$).
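
The regimes described above can be illustrated numerically. The sketch below assumes the standard two-level oscillation formula for a single domain, $P = \sin^2\!\big(\tfrac{BL}{2M}\sqrt{1+(E_{\rm crit}/E)^2}\big)/\big(1+(E_{\rm crit}/E)^2\big)$, which is not quoted from the source but reproduces the limits stated in the text. The unit conversions assume natural Heaviside-Lorentz units, and all function names are hypothetical.

```python
import math

GAUSS_IN_EV2 = 1.95e-2                  # 1 gauss in eV^2 (natural Heaviside-Lorentz units)
KPC_IN_INV_EV = 3.0857e21 / 1.9733e-5   # 1 kpc in eV^-1: length in cm over hbar*c in eV cm

def bl_over_2m(b_microgauss, l_kpc, m_gev):
    """Dimensionless combination B L / (2 M) for one magnetic domain."""
    b = b_microgauss * 1e-6 * GAUSS_IN_EV2
    length = l_kpc * KPC_IN_INV_EV
    return b * length / (2 * m_gev * 1e9)

def conversion_probability(energy, e_crit, delta):
    """Single-domain photon-ALP conversion probability (assumed standard form).

    energy and e_crit share the same units; delta is B L / (2 M).
    """
    x2 = (e_crit / energy) ** 2
    return math.sin(delta * math.sqrt(1 + x2)) ** 2 / (1 + x2)

# Illustrative domain: B = 1 microgauss, L = 10 kpc, M = 1e11 GeV (assumed values).
delta = bl_over_2m(1.0, 10.0, 1e11)
print(f"BL/2M = {delta:.3f}")
print(f"P(E >> E_crit) = {conversion_probability(1e6, 1.0, delta):.4f}")
```

In the weak-mixing case the probability tends to $(BL/2M)^2$ and stops depending on energy once $E$ exceeds $E_{\rm crit}(BL/2M)$, which is the behaviour the paragraph above describes.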

BDS estimate that the frequency independent and
strong mixing limits are both reached when $E > 0.3 - 3$ keV
if one assumes $m_\phi < 10^{-12}$ eV, $M < 3 \times 10^{11}$ GeV
and typical properties for the intra-cluster medium.
Thus, one expects the effects of mixing to be most significant
for X-rays and γ-rays, and be relatively small
for softer photons such as those in the optical or UV
bands. The test we describe will exploit this, by taking
the optical luminosities as a direct indicator of the
true luminosity in order to normalise the luminosities in
harder bands, where mixing could be significant.

In the strong mixing and frequency independent
regimes, BDS found that, beginning with a pure photon
beam, its intensity along a given line of sight will be
decreased by a random factor given by:

$$C = 1 - \tfrac{1}{2}(1+p_0)K_1^2 - \tfrac{1}{2}(1-p_0)K_2^2 , \qquad (2)$$

where $p_0$ is the initial degree of polarisation and $K_1$ and
$K_2$ are uniformly distributed random variables over the
interval [-1, 1]. The resulting probability distribution of
this ratio is given by
where $a = (1-p_0)/(1+p_0)$, $c_\pm = \min(c, (1 \pm p_0)/2)$ and
$p_0 \in [0,1)$ is the amount of linear polarisation¹. This
unusual PDF is shown in Figure 1 for three different values
of $p_0$. The expectation value of $C$ given by Eq. (3)
is independent of $p_0$ and amounts to $\langle C \rangle = 2/3$, meaning
that, on average, one-third of emitted photons that cross
$N \gg 1$ magnetic domains is turned into axions. Its standard
deviation increases with $p_0$ and is in the range 0.2-0.3.
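
The quoted mean and spread of the dimming factor can be checked by direct sampling. The sketch below assumes the form $C = 1 - \tfrac{1}{2}(1+p_0)K_1^2 - \tfrac{1}{2}(1-p_0)K_2^2$ with $K_1$, $K_2$ uniform on [-1, 1]. This form is consistent with the fully polarised limit and the mean of 2/3 stated in the text, but it is an assumption of this example. The function name is hypothetical.

```python
import math
import random
import statistics

def sample_dimming_factor(p0, n, seed=0):
    """Draw n samples of the dimming factor C for initial polarisation p0."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        k1 = rng.uniform(-1.0, 1.0)
        k2 = rng.uniform(-1.0, 1.0)
        samples.append(1.0 - 0.5 * (1.0 + p0) * k1 ** 2 - 0.5 * (1.0 - p0) * k2 ** 2)
    return samples

for p0 in (0.0, 0.5, 1.0):
    c = sample_dimming_factor(p0, 100_000)
    print(f"p0 = {p0}: mean = {statistics.fmean(c):.3f}, std = {statistics.pstdev(c):.3f}")
```

Under this assumption the sample mean stays near 2/3 for every $p_0$, while the standard deviation grows from roughly 0.2 to roughly 0.3 as $p_0$ goes from 0 to 1.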



¹ For details on the polarisation induced by photon-ALP mixing
refer to Burrage et al. [19].



The curves are perhaps simplest to understand in the
fully polarised case, where $K_2$ drops out of Eq. (2), leaving
the PDF described by the Jacobian of the transformation
from $K_1$ to $C$, giving $f_C(c) = \tfrac{1}{2}(1-c)^{-1/2}$. In the other
cases, one must marginalise over the residual degree of
freedom; the hard boundaries on the distributions of the
$K_i$'s lead to the features at $c = (1 \pm p_0)/2$, as seen in
Figure 1.




FIG. 1 - Probability distribution function of $C$, the ratio
between the photon and the total intensity, when $p_0 = 0$, 0.5, 1.



B. Other sources of scatter 

If we had perfectly calibrated sources, where we 
knew their distances, it would be straightforward to tell 
whether they were dimmed due to photon-ALP mixing 
with some scalar field. In the absence of such calibrated 
sources, BDS proposed to exploit empirical relations be- 
tween the luminosities in different bands of certain classes 
of astrophysical objects. The basic idea is that we have 
observations of objects in a large range of frequency 
bands; low frequency luminosities are assumed not to be 
affected by mixing, and so are taken to be an indicator 
of the true high frequency luminosity of the object, as- 
suming some empirical relation. We then examine how 
the observed high frequency luminosities relate to that 
predicted; we focus on the characteristic scatter caused 
by the photon-ALP mixing, as any average decrease in 
the luminosity is absorbed in the empirical relations. 

We require a relation between a quantity that is affected
by photon-ALP mixing ($Y$), such as X-ray or γ-ray
energy, and one that is unaffected ($X$), such as the
optical luminosity or some other low energy feature of
the light curve. A number of such empirical laws exist
for AGNs [2, 20-26], Blazars [27, 28] and γ-ray bursts
[29]. BDS focused on those in the form of a power law:

$$\log_{10}(Y) = a + b \log_{10}(X) ; \qquad (4)$$

here, the definition of $Y$ and $X$ depends on the empirical
law we are considering. If photon-ALP mixing occurs,
however, we never observe $Y$ but rather its "dimmed"
counterpart $CY$ (the same does not apply for $X$, which
is assumed to be unchanged by photon-ALP mixing).

Even in the absence of photon-ALP mixing, it is unrea- 
sonable to assume that low frequency and high frequency 
luminosities are perfectly correlated, as there could be 
many factors affecting these luminosities which vary from 
object to object. The origin of this intrinsic scatter de- 
pends on the physics of the emission of the different en- 
ergy photons, which varies according to the type of object 
under consideration. (See Sec. IV A for some specific examples
in the AGN case.) The intrinsic scatter is usually
assumed to be a Gaussian distributed random variable
with zero mean, $\sigma_{\rm in} N(0,1)$. Thus, our final data model
is

$$\log_{10}(Y) = a + b \log_{10}(X) + \log_{10}(C) + \sigma_{\rm in} N(0,1) . \qquad (5)$$

Here, the distribution of $C$ is given by $f_C$ in Eq. (3) and is
included only when photon-ALP mixing is assumed. The
resulting probability distribution for the total scatter
$S = \log_{10}(C) + \sigma_{\rm in} N(0,1)$ is the convolution between
$f_C$ and the Gaussian distribution:

$$f_S(s) = \int_0^1 dc \, f_C(c) \, \frac{1}{\sqrt{2\pi}\,\sigma_{\rm in}} \exp\left[ -\frac{(s - \log_{10} c)^2}{2\sigma_{\rm in}^2} \right] . \qquad (6)$$

To determine whether photon-ALP mixing actually occurred,
we compare two different models: the Gaussian
model, where the scatter from the empirical relation is
simply Gaussian, and the ALP-mixing model, where the
dimming due to photon-ALP mixing is superimposed
on the Gaussian scatter. Note that $\sigma_{\rm in}$ of the Gaussian
scatter is empirically determined, and is normalised to
match the observed scatter. In the mixing model, it is
assumed there is less intrinsic scatter, in order to keep
the total scatter constant. In fact, there is a minimum
amount of scatter predicted by the photon-ALP mixing
model, and if a probe were found with less scatter than
this, we could rule out the possibility of strong mixing
for that probe. The typical scatter coming from mixing
alone is $\sigma_{\rm mix} \approx 0.2$, but this varies with the degree of
initial polarisation.

Another factor to consider is the fraction of light paths 
which cross sufficiently magnetized regions to experience 
strong mixing. There is no guarantee that any given 
source will live in a cluster environment, or that its light 
will pass through such an environment on its way to us. 
Obviously, the less likely this is, the harder it will be to 
constrain the coupling to axion-like particles. The final 
distribution will be a linear combination of the mixed 
distribution and the intrinsic distribution, weighted by 
the fraction of photon paths which experience mixing 
($P_{\rm mix}$). For small $P_{\rm mix}$ the observed variance will be dominated
by the intrinsic variance; the resulting distribution
is very nearly Gaussian, but the mixing will significantly
increase the low luminosity tail. The dependence of the
likelihood on $P_{\rm mix}$ is shown in Figure 2.

FIG. 2 - Probability distribution of the scatter in the ALP-mixing
model for various $P_{\rm mix}$ values. We assumed $p_0 = 0.1$
and $\sigma_{\rm in} = 0.2$. For small $P_{\rm mix}$, the distribution is nearly
Gaussian, apart from the low intensity tail. Here and below
we focus on the low end tail, where the relative deviations
are largest.
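
The effect of the mixing fraction on the shape of the scatter can be sketched with a small simulation. The example assumes that the dimming factor has the form $C = 1 - \tfrac{1}{2}(1+p_0)K_1^2 - \tfrac{1}{2}(1-p_0)K_2^2$ and that a fraction $P_{\rm mix}$ of sources receives the extra term $\log_{10}(C)$ on top of Gaussian intrinsic scatter. Function names and sample sizes are hypothetical.

```python
import math
import random

def sample_scatter(n, p0=0.1, sigma_in=0.2, p_mix=1.0, seed=0):
    """Simulate n values of the scatter S for a given mixing fraction."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        s = rng.gauss(0.0, sigma_in)          # intrinsic Gaussian scatter
        if rng.random() < p_mix:              # this line of sight is strongly mixed
            k1, k2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
            c = 1 - 0.5 * (1 + p0) * k1 * k1 - 0.5 * (1 - p0) * k2 * k2
            if c <= 0:                        # guard against log10 of a non-positive number
                continue
            s += math.log10(c)
        out.append(s)
    return out

def skewness(xs):
    """Sample skewness (third central moment over variance^1.5)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

for p_mix in (0.0, 0.2, 1.0):
    s = sample_scatter(20_000, p_mix=p_mix)
    print(f"P_mix = {p_mix}: mean = {sum(s) / len(s):+.3f}, skewness = {skewness(s):+.3f}")
```

With no mixing the simulated scatter is symmetric; as $P_{\rm mix}$ grows, the distribution shifts to lower values and develops a negative skew, which reflects the enhanced low luminosity tail.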



C. Shot noise

Finally, another potentially important contribution to
the intrinsic scatter could be from shot noise in the numbers
of X-ray or γ-ray photons detected. This is of interest
because it follows a Poisson distribution rather than a
Gaussian, and could have similar effects as the photon-ALP
mixing on the scatter. The inferred X-ray luminosity
is proportional to $N$, while the intrinsic luminosity
is proportional to $\langle N \rangle$, so that we have an additional
contribution to the luminosity ratio which is $\log_{10}(N/\langle N \rangle)$,
where $N$ is assumed to have a Poisson distribution.

A very rough idea of the impact of the shot noise can
be estimated by comparing its variance to that arising
from the photon-ALP mixing. The photon-ALP mixing
variance increases with the initial polarisation, but
its minimum value is $\sigma_C^2 = 0.033$ when $p_0 = 0$. The
variance from shot noise depends on the number of photons
observed, and the associated variances are shown
in Table I. As can be seen, the variances are only comparable
to $\sigma_C^2$ for $N < 50$, while typical surveys exceed
this. In the catalogs we consider below, all the objects
have more than 50 net counts, with an average which is
greater than 1000. Thus, we do not expect naively that
shot noise will be a major issue for these observations.
On the other hand, objects in the catalogs considered by
BDS have photon counts as low as 10, with an average
of 120; half of these sources have fewer than 50 net X-ray
counts.

However, the contribution to the variance is only a
rough proxy for the true effect, which is sensitive to the
full distribution and in particular to the tail at low luminosities.
Also, if the fraction of sources which are strongly
mixed is relatively low, then the shot noise can be a more
significant issue. See Figure 3 for the probability distribution
of the scatter in presence of shot noise for various
$\langle N \rangle$ values. In Sec. VC we quantify this effect further
for the statistical tests we consider.

TABLE I - Variance of the shot-noise contribution to the
scatter for different average photon counts. The variance of
the ALP contribution to the scatter is at least $\sigma_C^2 = 0.033$.
In order to calculate the shot-noise variance we did not consider the case in
which $N = 0$ photons are collected, otherwise the variance
would have been infinite. This is reasonable since $N = 0$
implies no measurement at all.

FIG. 3 - Probability distribution of the scatter taking into
account shot noise for various values of collected X-ray photons.
We assumed $\sigma_{\rm in} = 0.2$ for the intrinsic Gaussian
noise. Like the ALP-mixing model, shot noise can increase
the low luminosity tail.
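
The shot-noise variance discussed in this section can be evaluated directly from the Poisson probabilities. The sketch below computes the variance of $\log_{10}(N/\langle N \rangle)$ with the $N = 0$ outcome excluded and the remaining probabilities renormalised. The function name and the chosen mean counts are illustrative assumptions, and the output is not meant to reproduce Table I.

```python
import math

def shot_noise_variance(mean_counts):
    """Variance of log10(N / <N>) for Poisson-distributed N, excluding N = 0."""
    lam = mean_counts
    # Sum far enough into the upper tail that the neglected probability is negligible.
    k_max = int(lam + 20 * math.sqrt(lam) + 50)
    norm = first = second = 0.0
    for k in range(1, k_max + 1):
        # Poisson probability evaluated in log space to avoid overflow.
        p = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
        x = math.log10(k / lam)
        norm += p
        first += p * x
        second += p * x * x
    mean = first / norm
    return second / norm - mean ** 2

for counts in (10, 50, 120, 1000):
    print(f"<N> = {counts}: variance = {shot_noise_variance(counts):.5f}")
```

For large mean counts the result approaches $1/(\langle N \rangle \ln^2 10)$, so the shot-noise contribution falls off inversely with the number of collected photons.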



III. STATISTICAL PROCEDURE 

A. Bayes ratio 

We look at a number of statistical tests to examine
whether the data support the ALP-mixing model. The
Bayesian approach to comparing two models is to compare
their Bayesian evidences; the evidence is the likelihood
of the observed data given a model ($A$), integrated
over the model parameters ($p_A$) and accounting for the
prior distributions of the parameters:

$$P(D|A) = \int d^n p_A \, P(D|A, p_A) \, P(p_A|A) . \qquad (7)$$



Here we wish to compare the evidences for the Gaussian
model and the ALP-mixing model; for simplicity we fix
the initial polarization $p_0$ and the mixing fraction, $P_{\rm mix}$.
The resulting models have the same parameters $a$, $b$ and
$\sigma_{\rm in}$, and we assume the same flat prior distribution for
these parameters.

If we make the further simplifying assumption that
the shapes of the likelihoods of the models are the same
(which must be established to first order by comparing
the $a$, $b$ and $\sigma_{\rm in}$ error bars of the models), the Bayes ratio
of the models is simply given by the ratio of the peaks of
the likelihoods:

$$\frac{P(D|{\rm ALP})}{P(D|{\rm G})} \simeq \frac{\mathcal{L}_{\rm ALP}(\hat{p}_{\rm ALP})}{\mathcal{L}_{\rm G}(\hat{p}_{\rm G})} .$$

Here, $\hat{p}$ are the parameters which maximise the likelihoods
of the two models. This ratio is effectively equivalent
to the $r$ quantity used by BDS:

$$r = 2 \ln \left[ \frac{\mathcal{L}_{\rm ALP}(\hat{p}_{\rm ALP})}{\mathcal{L}_{\rm G}(\hat{p}_{\rm G})} \right] .$$

If $r > 0$, then the ALP-mixing model is preferred over
the Gaussian model; if $r < 0$ the opposite is true. The
absolute value of $r$ is distributed to a good approximation
as a $\chi^2$ random variable with one degree of freedom. For
example, $|r| = 9$ corresponds to a ~3σ preference for
one model over the other.
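
The correspondence between $|r|$ and a Gaussian-equivalent significance follows from the one degree of freedom $\chi^2$ approximation stated above. A minimal sketch, with hypothetical function names:

```python
import math

def sigma_equivalent(r):
    """Gaussian-equivalent significance for |r| treated as chi-squared with 1 dof."""
    return math.sqrt(abs(r))

def chi2_1dof_pvalue(r):
    """Probability that a chi-squared variable with 1 dof exceeds |r|."""
    return math.erfc(math.sqrt(abs(r) / 2))

for r in (9, 11, 14, 25):
    print(f"|r| = {r}: {sigma_equivalent(r):.2f} sigma, p = {chi2_1dof_pvalue(r):.2e}")
```

This is only as good as the $\chi^2$ approximation itself; it says nothing about whether either model fits the data well.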



Below we will be comparing the likelihoods of the various
model parameters for different sets of data. For the
Gaussian model, the parameters include the amplitudes
$a$ and $b$, and the intrinsic scatter $\sigma_{\rm in}$. For the mixing
model, the parameters also include $p_0$ and $P_{\rm mix}$. The
data are simply a collection of $N$ observations of intensities
of various sources in two different bands, $X_i$ and
$Y_i$, which are taken to be independent. The full resulting
likelihood of the parameters given the data is:

$$\mathcal{L}(a, b, \sigma_{\rm in}, p_0, P_{\rm mix}) = \prod_{i=1}^{N} \left[ (1 - P_{\rm mix}) \, G(S_i; \sigma_{\rm in}) + P_{\rm mix} \int_0^1 dC \, f_C(C; p_0) \, G(Z_i; \sigma_{\rm in}) \right] ,$$

where $G(x; \sigma)$ denotes a zero-mean Gaussian density of width $\sigma$,
$S_i(a,b) = \log_{10}(Y_i) - a - b \log_{10}(X_i)$,
$Z_i(a,b) = S_i(a,b) - \log_{10}(C)$ and we assumed $P_{\rm mix}$ does not depend
on redshift.

As discussed above, we then perform a Maximum Likelihood
Estimate (MLE) of the parameters ($a$, $b$, $\sigma_{\rm in}$) for
the Gaussian model ($P_{\rm mix} = 0$) and the ALP-mixing
model, leaving $p_0$ and $P_{\rm mix}$ fixed. As a result, we obtain
(i) two sets of parameters: $(a, b, \sigma_{\rm in})_{\rm G}$ and $(a, b, \sigma_{\rm in})_{\rm ALP}$
and (ii) the respective maximized likelihoods: $L_{\rm G}$ and
$L_{\rm ALP}(p_0, P_{\rm mix})$.
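
A minimal sketch of the likelihood evaluation may help fix ideas. It assumes a mixture likelihood in which a fraction $P_{\rm mix}$ of sources has its scatter shifted by $\log_{10}(C)$, with the marginalisation over $C$ approximated by a fixed Monte Carlo sample drawn from the assumed form $C = 1 - \tfrac{1}{2}(1+p_0)K_1^2 - \tfrac{1}{2}(1-p_0)K_2^2$. All function names are hypothetical, and the maximisation of the ALP-mixing likelihood over ($a$, $b$, $\sigma_{\rm in}$) is left to a general-purpose optimiser that is not shown.

```python
import math
import random

def gauss_pdf(x, sigma):
    """Zero-mean Gaussian density of width sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)

def gaussian_mle(xs, ys):
    """Closed-form MLE of (a, b, sigma_in) for the Gaussian model y = a + b x + noise."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    sigma = math.sqrt(sum((y - a - b * x) ** 2 for x, y in zip(xs, ys)) / n)
    return a, b, sigma

def draw_log10_c(p0, m, seed=1):
    """Monte Carlo sample of log10(C), used to marginalise over the dimming factor."""
    rng = random.Random(seed)
    out = []
    while len(out) < m:
        k1, k2 = rng.uniform(-1, 1), rng.uniform(-1, 1)
        c = 1 - 0.5 * (1 + p0) * k1 ** 2 - 0.5 * (1 - p0) * k2 ** 2
        if c > 0:
            out.append(math.log10(c))
    return out

def log_likelihood(xs, ys, a, b, sigma_in, p_mix=0.0, log_c=()):
    """Log-likelihood of log10 luminosities (xs, ys) under the mixture model."""
    total = 0.0
    for x, y in zip(xs, ys):
        s = y - a - b * x                               # scatter S_i
        dens = (1 - p_mix) * gauss_pdf(s, sigma_in)     # unmixed lines of sight
        if p_mix > 0:
            # Mixed lines of sight: average the Gaussian over Z_i = S_i - log10(C).
            dens += p_mix * sum(gauss_pdf(s - lc, sigma_in) for lc in log_c) / len(log_c)
        total += math.log(dens)
    return total
```

Setting `p_mix=0` recovers the Gaussian model, so the same function can be used for both terms of the likelihood ratio once each model has been maximised separately.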



B. Goodness-of-fit tests 

The Bayes ratio is the best means of comparing two 
models, but it does not examine whether either model 
provides a good fit to the data. BDS also looked at 
Bootstrap [30] resamplings of the data sets, and showed 
plots of the variance versus skewness of the data (their 
so-called 'fingerprints'), comparing these to what is ex- 
pected in the Gaussian and ALP models. In Sec. IV A we 
shall reproduce their analysis. In particular, we will show 
that much of the structure in these fingerprints arises 
from resampling of a few outliers multiple times and the
resulting significance of such plots is hard to quantify.

Instead, we focus on the 1-D cumulative distributions 
of the scatter around the mean behavior which contain 
all of the relevant information. For this kind of un- 
binned data, a standard goodness-of-fit statistic is the 
Kolmogorov-Smirnov (KS) test, which looks at the max- 
imum difference of the cumulative distributions. As we 
are fitting for parameters of the distribution, we simulate 
the process to see how often the observed KS statistic oc- 
curs in the two models. 

As we shall see, statistics like the Bayes ratio are domi- 
nated by a few sources where the X-ray intensity is much 
lower than expected. The KS test is not greatly sensi- 
tive to the tails of the distribution, so we also examine 
some related statistical tests, the Kuiper test and the 
Anderson-Darling (AD) statistic. (A description of these 
can be found in Press et al. [30].) Briefly, the Kuiper 
statistic is the sum of the largest positive and negative
differences between the observed and theoretical cumulative
distributions, while the AD statistic is a renormalised
version which gives more importance to the tails of the
distribution. Again, these tests are calibrated using simulations
of the full process.
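
The three statistics can be computed from the sorted sample and a model CDF. The sketch below uses the standard textbook definitions of the KS, Kuiper and Anderson-Darling statistics; the function names are hypothetical, and the calibration by simulation described in the text is not included.

```python
import math

def gof_statistics(sample, cdf):
    """Return (KS, Kuiper, Anderson-Darling) statistics of a sample against a model CDF."""
    xs = sorted(sample)
    n = len(xs)
    f = [cdf(x) for x in xs]
    # Largest excess of the empirical CDF over the model, and of the model over the empirical CDF.
    d_plus = max((i + 1) / n - f[i] for i in range(n))
    d_minus = max(f[i] - i / n for i in range(n))
    # Anderson-Darling: the logarithms weight deviations in both tails more strongly.
    ad = -n - sum(
        (2 * i + 1) * (math.log(f[i]) + math.log(1 - f[n - 1 - i])) for i in range(n)
    ) / n
    return max(d_plus, d_minus), d_plus + d_minus, ad

def gaussian_cdf(sigma):
    """CDF of a zero-mean Gaussian of width sigma, as a callable."""
    return lambda x: 0.5 * (1 + math.erf(x / (sigma * math.sqrt(2))))

ks, kuiper, ad = gof_statistics([-0.35, -0.1, 0.02, 0.15, 0.3], gaussian_cdf(0.2))
print(f"KS = {ks:.3f}, Kuiper = {kuiper:.3f}, AD = {ad:.3f}")
```

Because the model parameters are fitted to the same data, the usual tabulated significance levels do not apply, which is why the text calibrates these statistics with simulations.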

As is evident in the cumulative plots below, while the
presence of the outliers can strongly favor the ALP distribution
over the Gaussian, often many more outliers are
seen than is expected by either model. This most likely
suggests that neither model is correct and that another
explanation could be required for the low X-ray luminosity
of some sources. One strong candidate is that the soft
X-rays are sometimes strongly absorbed.



IV. DATA ANALYSIS 

A. Previous analyses 

To discriminate between the Gaussian model and the
ALP-mixing model, BDS analysed several classes of astrophysical
objects where an empirical law of the form in
Eq. (4) is valid. Examples of such objects are Blazars,
γ-ray bursts and Active Galactic Nuclei (AGN). They
obtained relevant results using a well-known correlation
between the 2500 Å and 2 keV monochromatic luminosities
of AGNs [2, 20-26]:

$$\log_{10}(L_{2500\,\text{Å}}) = a + b \log_{10}(L_{2\,{\rm keV}}) .$$

BDS took into consideration 77 optically selected AGN
with redshifts less than 2.7 taken from Steffen et al. [24]. 
Of these, 32 are from the COMBO-17 survey [31] and 45 
from the Bright Quasar Survey (BQS) [32]. These two 
sets are matched with X-ray measurements coming re- 
spectively from the Extended Chandra Deep Field-South 
survey [33] and the ROSAT experiment [34, 35]. We will 
refer to this set of AGNs as the BDS-77 catalog. By 
analysing it and assuming $P_{\rm mix} = 1$, BDS obtained:

$$r(p_0, P_{\rm mix} = 1) \approx 14$$

for $0 < p_0 < 0.4$ and $r(p_0) > 11$ for $p_0 > 0.4$. This
corresponds respectively to a 3.7σ and 3.3σ evidence in
favor of the ALP-mixing model.

Burrage et al. [14] recently extended their sample to 
203 AGNs by including 126 more AGNs with redshifts 
less than 3.8 from Strateva et al. [23], optically selected 
from the Sloan Digital Sky Survey [36] and matched for 
the most part with ROSAT X-ray data [34, 35]. The net 
X-ray counts are in the range 10 - 1500 with an average 
of 120 counts and 13% of AGNs below 20 counts. We will 
refer to this set of AGNs as the BDS-203 catalog. Using 
this catalog and assuming $P_{\rm mix} = 1$, they obtained the
following result:

$$r(p_0 < 0.5, P_{\rm mix} = 1) \approx 25 ,$$

which corresponds to a 5σ evidence in favor of the ALP-mixing
model.

Below, we will quote only results where $p_0 = 0.1$ since
(i) this is the amount of linear polarisation predicted for
AGN light [37] at $E = 2$ keV and (ii) the likelihood is
almost insensitive to changes in $p_0$.



The problem of X-ray absorption 

In addition to photon-ALP mixing, there are many 
other possible sources of scatter in the intrinsic relation 
between an AGN's optical or UV luminosity and its X- 
ray luminosity. The X-rays are thought to arise from 
the hot coronal gas, while the lower energy photons are 
believed to radiate from the AGN accretion disk. X-ray 
emission can be enhanced if there are jets, or suppressed 
if the coronae are absent or disrupted, or if there is signifi- 
cant absorption, which might occur as a result of outflows 
[26]. In addition, variation may occur because the time 
dependent X-ray and optical luminosities are measured 
at different epochs, though recent data have suggested 
that most of the scatter remains even for coeval observa- 
tions [38]. 

Here we focus primarily on the effects of mixing, but 
if evidence suggests that there may be mixing, other 
sources of scatter must also be considered to explain the 
data. The r-test employed by BDS is a simple likelihood 
ratio test. The significant preference shown by the BDS- 
77 and BDS-203 catalogs for the ALP-mixing model over 
the Gaussian model only demonstrates the relative fits 
of the models, but does not imply that either is actu- 
ally a good fit to the data. One way to evaluate the 
goodness of fit is to look at the Cumulative Distribution 
Functions (CDF). In Figure 4 we plot the CDF of the 
scatter according to the two models against the empir- 
ical CDF coming from these catalogs. For both AGN 
sets, neither of the theory curves manages to reproduce 
the features of the scatter, though the Gaussian model 
is a much worse fit. The empirical CDF is much larger 
than the model CDFs at the low end of the scatter axis, 
meaning that the catalogs contain several objects with an 
X-ray to UV/optical luminosity which is much lower than 
what is likely via the X-ray dimming due to photon-ALP 
mixing. 

This effect can be naturally explained if we assume 
that X-ray light from these AGNs has been absorbed. 
Steffen et al. [24] attempted to remove this possible contamination
by excluding AGNs with flat X-ray spectra
according to the effective X-ray power-law photon index
Γ (some details on this procedure can be found in Sec.
IV B 2); however, this was only possible for the Chandra
subset, leaving some potentially contaminated AGN
which can dominate the statistics. Unfortunately, the $r$
statistic is easily dominated by outliers, so that even a
small contamination can significantly bias the result.

To emphasise the effect of the unaccounted for X-ray
absorption, we exclude from the BDS-77 and BDS-203
catalogs one z = 0.067 AGN (PG2214+139, also known
as MKN 304), which is known to be heavily obscured
in the X-ray [39, 40]. This source has a very flat X-ray
spectrum dominated by strong absorption features due to
ionized gas. The gas is well modeled by a two-component
warm gas, which yields one of the highest column densities
seen with XMM-Newton and Chandra. PG2214+139 is the
biggest outlier in both datasets and has a scatter ~4.5
standard deviations below the average. It is actually visible
in Figure 4 as the leftmost point in both plots. By
repeating the r-test without PG2214+139, we obtain a
drop in the r-statistic of 64% for the BDS-77 catalog and
of 36% for the BDS-203 catalog.

FIG. 4 - Cumulative Distribution Functions of the scatter for the BDS-77 catalog (left panel) and BDS-203 (right panel).
Each empirical CDF is plotted along with the ALP-mixing model and Gaussian model theoretical CDFs. We assumed
$p_0 = 0.1$ and $P_{\rm mix} = 1$.

Another AGN in the BDS-203 catalog that deserves
special attention is TGN336Z208, known in SDSS as
SDSSJ134351.12+000434.8, which has been spectroscopically
classified [41] as a Broad Absorption Line (BAL)
AGN. It is thought that BALs are AGNs viewed through
the non-spherical wind that surrounds the accretion disk
of the supermassive black hole powering AGN emission
(see Murray et al. [42]). The X-ray absorption resulting
from this obscuration renders BAL AGNs unsuited
to study the intrinsic correlation between their UV and
X-ray emissions [43, 44].² Moreover, Strateva et al. [23]
only report 24 net X-ray counts for TGN336Z208, a number
which is too low to permit model fitting on the spectrum.
By removing both SDSSJ134351.12+000434.8 and
PG2214+139 from the BDS-203 catalog, the $r$ statistic
drops from $r \sim 25$ to $r \sim 7$, a ~70% decrease.

Inevitably, our focus was drawn to these particular 
AGN because they were less X-ray bright and dominate 
the Bayes ratio test; this means that some care must be 
taken to treat all the data consistently to avoid a poste- 
rior bias which would occur from arguing selectively to 
omit those AGN with the highest scatter. However, we 
have shown that there are independent reasons, based on 
their spectroscopic properties, for excluding these from 
the sample. In any case, these results emphasise the great 
sensitivity of this test to outliers and the need to ensure 
a sample free from absorption. 



² The selection criteria in forming the BDS-203 catalog [23] included
removal of BALs according to the properties of the C IV
and Mg II absorption lines. However, these can be seen in SDSS
spectra only for sources with z > 1.55 and z > 0.45 respectively,
while TGN336Z208 has z = 0.0736.



Bootstrap resamplings 

As further evidence of the ALP-mixing model, BDS 
introduced the concept of 'fingerprints', which are plots 
based on Bootstrap resamplings of the AGN data set. 
They created 10^ data sets derived from the BDS-77 cat- 
alog, each of the same size as the original catalog, by 
sampling it with replacement. For each of these resampled
data sets, they calculated the moments of the distributions,
defined as

$$k_m = \frac{1}{N} \sum_{i=1}^{N} S_i^m ,$$

where $N$ is the number of AGNs in the sample and $S_i$ is
the scatter of the i-th AGN. The fingerprints are scatter
plots of the moments (e.g. the variance, $k_2$, versus the
skewness, $k_3$) for all of the resampled data sets. We have
reproduced this analysis and it can be seen in Figure 5a.

BDS showed that there were similarities between the 
scatter plots generated from the data and those generated 
from a sample simulated with the ALP-mixing model 
which were not seen in the simple Gaussian case. Ex- 
ample simulations can be seen in Figures 5b and 5c. The 
data and the simulations both share a similar tail to large 
variance and negative skewness, which are rare in Gaus- 
sian simulations. In addition, the data and Figure 5c also 
have a similar substructure in this tail. 

However, these fingerprints are similarly sensitive to outliers in the data set which is being resampled. For example, the substructure apparent in Figure 5c is due to a single outlier more than 4σ away from the mean. In some resamplings, this outlier does not appear, resulting in two islands of low variance and small skewness (which can be positive or negative); in other resamplings it appears once, resulting in another island with higher variance and more negative skewness. The outlier can also be resampled multiple times, and further islands of substructure arise from the outlier appearing two, three or even four times. The actual data is similar, but there is one large outlier and one moderate outlier, which together provide somewhat finer structure.
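
The relative population of these islands can be estimated with a short calculation. In a resampling with replacement of N objects, the number of times one given object is drawn follows a binomial distribution with N trials and success probability 1/N. The sketch below evaluates this for N = 77; the function name is illustrative only.

```python
from math import comb

def outlier_multiplicity_probs(n, k_max=4):
    """P(a given object is drawn exactly k times) in one resampling of size n."""
    p = 1.0 / n
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(k_max + 1)]

probs = outlier_multiplicity_probs(77)
for k, pk in enumerate(probs):
    print(f"outlier appears {k} time(s): probability {pk:.3f}")
```

Under this counting, resamplings without the outlier and with a single copy of it are about equally common, while those with three or four copies are progressively rarer, which is why the corresponding islands are sparsely populated.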

Substructure, since it results from outliers, is more 
likely to arise from the ALP-mixing model than in the 
Gaussian model because of its higher tail (Figure 2.) 
However, even in the ALP-mixing model such substruc- 
ture is not common in resamplings based on typical simu- 
lations; a more typical result is shown in Figure 5b. Even 
in the data, when the data sets get bigger, the impact of 
a single data point is smaller and the substructure is less 
apparent. (For example, significantly less substructure 
was seen in the BDS-203 catalog.) 

The tail to large variance and negative skewness does 
remain in the data and the simulations of the ALP- 
mixing model and is absent from the Gaussian models. 
This reflects the fact that both distributions are skewed, 
having larger tails on the side with lower luminosities. 
However, the similarities are qualitative, and the fingerprints have not been used to quantify the size of the tails, where the data and the ALP-mixing model differ significantly (as was seen in Figure 4).

FIG. 5 - Fingerprint plots of variance versus skewness. 
Each point represents a Bootstrap resampling of 77 data 
points (a) from the BDS-77 catalog of AGNs, (b) simu- 
lated with the ALP-mixing model and (c) simulated with 
the ALP-mixing model, but where there happens to be a 
large outlier. 



B. Defining our AGN samples 

Here we take advantage of a new AGN sample, us- 
ing more than 300 AGNs from the Fifth Data Release 
Sloan Digital Sky Survey/XMM-Newton Quasar Survey 
(Young et al. [2, 45]). This sample contains three opti- 
cal monochromatic luminosities and six X-ray monochro- 
matic luminosities. The multi-band data allow us to 
compute the r statistic for 18 different combinations of 
optical/UV and X-ray monochromatic luminosities. In 
addition, spectral fits of each AGN aid in excluding X- 
ray obscured AGNs from our sample. The harder X-ray 
luminosities available in this sample should also be less 
subject to absorption. 



1. The SDSS/XMM-Newton catalog 

The SDSS/XMM-Newton catalog contains 792 AGNs. 
We first select 340 AGNs by applying the same cuts per- 
formed in Young et al. [2] : 

1. we exclude all Radio Loud Quasars; 

2. we exclude all Broad Absorption Line Quasars; 

3. we consider only AGNs with X-ray detections characterised by a signal-to-noise ratio (S/N) greater than 6. The objects we are left with have net X-ray counts in the range 51 - 39300, with an average of 1336 counts (so that shot noise is minimal). This allows for spectral fits to be made over the 0.5 - 10 keV band for each source;

4. we select AGNs whose preferred X-ray spectral fit is a single power-law (SPL) with no intrinsic absorption;

5. among the remaining AGNs, we choose only those with a good spectral fit, that is with a reduced χ² statistic (i.e. χ²/d.o.f.) smaller than 1.2.

The resulting set of AGNs lies in the redshift range 0.1 
- 4.4. We will refer to it as the Full catalog. 
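
The five cuts above can be sketched as a simple filter. The record layout and field names below are hypothetical stand-ins for the catalog columns, and the three toy entries are invented purely to exercise the code.

```python
from dataclasses import dataclass

@dataclass
class Agn:
    # All field names are hypothetical, not actual catalog column names.
    radio_loud: bool
    bal: bool             # Broad Absorption Line quasar flag
    snr: float            # X-ray signal-to-noise ratio
    best_fit: str         # "SPL" = single power law, no intrinsic absorption
    reduced_chi2: float   # chi^2 / d.o.f. of the preferred spectral fit

def passes_cuts(agn):
    """Apply cuts 1-5 in the order listed in the text."""
    return (
        not agn.radio_loud
        and not agn.bal
        and agn.snr > 6
        and agn.best_fit == "SPL"
        and agn.reduced_chi2 < 1.2
    )

toy_catalog = [
    Agn(False, False, 12.0, "SPL", 0.95),   # kept
    Agn(True, False, 30.0, "SPL", 1.00),    # rejected: radio loud
    Agn(False, False, 8.0, "SPL", 1.35),    # rejected: poor spectral fit
]
full = [agn for agn in toy_catalog if passes_cuts(agn)]
```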



2. Removing obscured AGNs 

We also consider a further cut on the effective X-ray power-law photon index Γ (column 10 in Tab. 2 of Young et al. [45]). The photon spectral index is defined as Γ = -α + 1, where -α is the exponent obtained by fitting the X-ray part of the AGN spectrum with a single power-law model. A low Γ value implies a flat spectrum, that is a spectrum where the soft X-ray component (i.e. the 0.5 - 2 keV band) is much weaker than the hard X-ray component (i.e. the 2 - 10 keV band). There is strong evidence [26, 40, 43, 44] that X-ray absorption is the primary cause of soft X-ray weakness. Intuitively, this happens because low energy X-rays are more easily absorbed than high energy ones.

Following Steffen et al. [24], we select only those AGNs where Γ > 1.6, thus reducing our sample to 320 AGNs. We will refer to this as the High-Γ catalog. One can further reduce the impact of X-ray absorption by imposing cuts on the signal-to-noise ratio; it is possible, for example, to form AGN catalogs where S/N > 10 or S/N > 20. We statistically analysed these samples as well and obtained results very similar to those from the High-Γ catalog. Moreover, we applied cuts on the Galactic column density in the direction of the sources and did not find any significant correlation with the r-statistic.



3. Multi-wavelength data 






A significant advantage of the sample of Young et al. is its many frequency bands. It consists of the rest-frame monochromatic luminosities at the following frequencies: 1500 Å, 2500 Å, 5000 Å for the optical/UV side and 1 keV, 1.5 keV, 2 keV, 4 keV, 7 keV, 10 keV for the X-ray side. By applying the statistical analysis outlined in Sec. III to multi-wavelength data, we can check whether the r statistic varies as the X-ray frequency goes from the soft to the hard side of the spectrum. The ALP-mixing model does not predict any variation since, as long as we are in the frequency-independent limit (see Sec. I), the oscillation probability in Eq. (1) is insensitive to the photon energy. On the other hand, soft X-rays are more easily absorbed than hard X-rays. It is therefore clear that a multi-wavelength analysis can help discriminate between photon dimming due to photon-ALP mixing and X-ray absorption.
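
The 18 band combinations are simply the pairs of the three optical/UV wavelengths with the six X-ray energies. A minimal enumeration, with the soft/hard split placed at 2 keV as in the band definitions used earlier, could look like this (variable names are illustrative):

```python
from itertools import product

optical_uv_angstrom = [1500, 2500, 5000]
xray_kev = [1, 1.5, 2, 4, 7, 10]

# Every (optical/UV wavelength, X-ray energy) pair to be analysed.
combinations = list(product(optical_uv_angstrom, xray_kev))

# Split used in the discussion: soft band up to 2 keV, hard band above it.
soft = [(w, e) for w, e in combinations if e <= 2]
hard = [(w, e) for w, e in combinations if e > 2]
print(len(combinations), len(soft), len(hard))
```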



C. Results 



In Table II we report the results we obtained by analysing the Full and High-Γ catalogs using the method outlined in Sec. III at the various wavelengths and for four P_mix values. Figure 6 shows the same results in a plot of r versus X-ray energy. In the following discussion we shall always assume an optical/UV wavelength of 2500 Å, since the dependence on optical wavelength is barely noticeable.



1. Full catalog



The outcome of the r-test depends significantly on the analysed X-ray frequency. Regardless of P_mix, the ALP-mixing model is strongly preferred in the soft X-ray band (1 keV, 1.5 keV and 2 keV). For example, for E = 1 keV and P_mix = 1 we have r ≈ 63, a ~8σ preference for the ALP-mixing model. However, as soon as we move to the hard X-ray bands (4 keV, 7 keV and 10 keV), r quickly approaches zero: neither model is preferred. The energy dependence suggests that X-ray absorption, rather than photon-ALP mixing, represents the main contribution to the scatter at low energies. Where the r value is high, the PDFs are not especially consistent with either the ALP model or the Gaussian model.

P_mix   Catalog   1 keV   1.5 keV   2 keV   4 keV   7 keV   10 keV
1.0     Full      63      45        26      -0.91   -2.6    0.39
        High-Γ    9.2     -1.9      -5.4    -7.8    -3.9    -0.20
0.7     Full      62      43        25      -1.3    -2.8    0.28
        High-Γ    8.6     -2.8      -6.2    -8.2    -4.0    -0.34
0.4     Full      60      44        26      0.60    -1.5    0.86
        High-Γ    10      0.87      -2.4    -5.6    -2.6    0.38
0.1     Full      42      32        19      2.4     -0.12   0.66
        High-Γ    9.2     4.7       2.6     -0.90   -0.50   0.55

TABLE II - Outcome of the r-test applied to the Full (upper line) and High-Γ (lower line) catalogs at the various wavelengths and for different mixing fractions. Positive values imply that the ALP-mixing model is preferred; negative values imply that the Gaussian model is preferred. We assumed p_0 = 0.1, but the results do not change greatly for different values.

FIG. 6 - Value of the r-statistic for the various wavelengths and catalogs (P_mix = 1). The evidence for ALPs is significantly lower in those bands where the effects of absorption are expected to be smaller.



In Figure 7a we plot the Cumulative Distribution Func- 
tions (CDF) for the ALP-mixing model and the Gaussian 
model together with the empirical CDF of the Full cata- 
log. In the soft X-ray band both the ALP-mixing model 
and the Gaussian model fail to reproduce the features 
of the scatter. The preference for the ALP-mixing model 
over the Gaussian model given by the r-test at these ener- 
gies is due to a very bad performance of the latter rather 
than to a good performance of the former. Moreover, 
both models systematically overestimate the X-ray lumi- 
nosities: X-ray absorption, again, could be the culprit of 
this discrepancy. In the hard X-ray band, where absorp- 
tion is less likely to affect the photons, the empirical CDF 
stays in between the two theoretical curves: no model is 
to be preferred. 



2. High-Γ catalog 

The signal for the ALP-mixing model drops dramatically upon removing the 20 AGNs which have a lower photon spectral index. As shown in Table II, this is true for all the considered P_mix values. The drop can be understood by looking at the High-Γ CDFs in Figure 7b.

FIG. 7 - Cumulative distribution functions of the scatter for the Full (a) and High-Γ (b) catalogs. The empirical CDF is plotted along with the ALP-mixing model and Gaussian model theoretical CDFs. For the ALP-mixing model we assumed p_0 = 0.1 and P_mix = 1. When P_mix < 1, the data curve stays the same while the CDF of the ALP-mixing model tends to that of the Gaussian model.

At the lowest energies, the distribution is very much like the expected ALP prediction, especially in the 1 keV case. However, we do not expect the photon-ALP mixing signal to be frequency dependent, whereas X-ray absorption is. This suggests that the agreement with the ALP model is likely the accidental effect of a small amount of residual intrinsic X-ray absorption, rather than of photon-ALP mixing.



V. SIMULATED DATA SETS 

It is useful to understand how sensitive to mixing we expect our tests to be, given the catalogs we consider. Using simulated samples drawn from the mixing distribution, we can measure the distribution of r-values which would typically occur and then compare these to the r-values obtained in Sec. IV C. If we measure a significantly different value, this could indicate that there is an alternative explanation of the signal.

The same simulated data sets can be used to derive the statistical significance of the goodness-of-fit tests introduced in Sec. III, namely the Kolmogorov-Smirnov (KS), Kuiper and Anderson-Darling (AD) tests. We will do so in Sec. V D.

For the models we analysed in Sec. II (the ALP-mixing model, the Gaussian model and the shot-noise model), the PDFs are known and it is straightforward to produce samples consistent with each. For each assumed scatter model, we adjust the parameters to match those of the observed samples; in particular we match the total variance σ_tot² of the scatter distribution for each sample, as well as the number N_AGN of AGNs in each sample. We consider the three combinations of N_AGN and σ_tot² corresponding to the BDS-203, Full and High-Γ catalogs. All values of N_AGN and σ_tot are shown in Table III.



Catalog    N_AGN   σ_1keV   σ_1.5keV   σ_2keV   σ_4keV   σ_7keV   σ_10keV
BDS-203    203     -        -          0.340    -        -        -
Full       340     0.357    0.314      0.300    0.296    0.319    0.344
High-Γ     320     0.282    0.259      0.260    0.282    0.315    0.340

TABLE III - Combinations of N_AGN and σ_tot used to generate simulated scatter samples of the AGN catalogs.



For each model and each dataset, we generate 10,000 scatter samples, and apply the r-test outlined in Sec. III to each generated sample. Histograms of these results are shown in the figures below. Each figure also reports the measured signal as a vertical black line and its statistical significance (i.e. the value of the empirical CDF) in the legend.
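
The structure of this Monte Carlo procedure can be sketched as follows. The sketch is not the r-test itself: since the r-statistic is defined in Sec. III, the sample skewness is used here as a placeholder statistic, the scatter is drawn from a zero-mean Gaussian, and the "measured" value is a made-up number. All function names are hypothetical.

```python
import random

def empirical_cdf(simulated, value):
    """Fraction of simulated statistics less than or equal to `value`."""
    return sum(1 for s in simulated if s <= value) / len(simulated)

def sample_skewness(sample):
    """Placeholder statistic (NOT the r-statistic of Sec. III)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / n
    return sum((x - mean) ** 3 for x in sample) / n / var ** 1.5

def simulate_statistics(n_agn, sigma_tot, n_samples, seed=0):
    """Apply the statistic to `n_samples` simulated Gaussian scatter samples."""
    rng = random.Random(seed)
    return [
        sample_skewness([rng.gauss(0.0, sigma_tot) for _ in range(n_agn)])
        for _ in range(n_samples)
    ]

# N_AGN = 320 and sigma_tot = 0.260: High-Gamma catalog at 2 keV (Table III).
# 2,000 samples keep the sketch fast; the analysis in the text uses 10,000.
simulated = simulate_statistics(n_agn=320, sigma_tot=0.260, n_samples=2000)
measured = -0.2  # hypothetical measured value, for illustration only
significance = empirical_cdf(simulated, measured)
```

Replacing the placeholder statistic with the r-test and the Gaussian sampler with the model under study would give histograms and significances of the kind described above.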



A. ALP-mixing model results 



In Figures 8, 9 and 10, we show the r-distributions 
obtained for the various catalogs for E = 2keV. Before 
comparing them with the measured signals, let us point 
out some general properties of the r-test which stem from 
these distributions. 

First, the r-test may not be sufficient to provide a conclusive preference for photon-ALP mixing even if the latter is actually happening. Consider the expectation value of the r-statistic for an AGN sample with N_AGN = 203 and σ_tot = 0.34 (i.e. the BDS-203 catalog), in the best scenario of P_mix = 1. This amounts to ⟨r⟩ = 3.2, which corresponds to an evidence for the ALP-mixing model of less than 2σ. Even if photon-ALP mixing were happening, the BDS-203 catalog would usually not be enough to provide a detection. The statistical significance attainable with the r-test increases with N_AGN and P_mix and decreases with the intrinsic variance of the dataset. For the Full and High-Γ catalogs, we have ⟨r⟩ ≈ 9 and ⟨r⟩ ≈ 21, which correspond to a preference for the ALP-mixing model of 3σ and 4.5σ respectively.

When P_mix < 1, we expect it to be harder to discriminate between the ALP-mixing model and the Gaussian model and, therefore, the average of the r-statistic should be smaller. The simulations confirm this behaviour. For example, if P_mix = 0.4, then the average r-values for the Full and High-Γ data sets shrink to ⟨r⟩ ≈ 6.6 and ⟨r⟩ ≈ 13, that is a 2.6σ and 3.6σ preference respectively.
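
As a quick numerical check, the pairs of average r-values and significances quoted for the Full and High-Γ catalogs are consistent with a significance of roughly the square root of ⟨r⟩. This is an observation about the quoted numbers, made here as an assumption; it is not taken from the definition of the r-test in Sec. III.

```python
from math import sqrt

# (average r, quoted significance in sigma) as given in the text.
quoted = [(9, 3.0), (21, 4.5), (6.6, 2.6), (13, 3.6)]

for r, sigma in quoted:
    print(f"<r> = {r:>4}: sqrt = {sqrt(r):.2f}, quoted = {sigma}")

# Largest difference between sqrt(<r>) and the quoted significance.
max_gap = max(abs(sqrt(r) - sigma) for r, sigma in quoted)
```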

The figures also demonstrate another issue: the probability of a false-negative result, i.e. the r-test yielding a preference for the Gaussian model when the scatter comes from the ALP-mixing model, is not negligible even when P_mix = 1. We define the false-negative probability as the value of the empirical CDF at r = 0. For the BDS-203 catalog, this amounts to 0.27: there is a 27% probability that the r-test gives a false-negative result when applied to a dataset similar to the BDS-203 catalog. The false-negative probability for the Full and High-Γ catalogs is smaller since both have a smaller scatter variance than BDS-203, but is still significant: 11% and 4% respectively.

Clearly the limiting factors are the number N_AGN of AGNs and the intrinsic variance of the scatter of the empirical relation taken into consideration. The contribution to the scatter from mixing is fixed (for a given P_mix), so the larger the total scatter, the greater the relative contribution from the intrinsic scatter and the harder it is to distinguish photon-ALP mixing. Statistically significant results require datasets larger than those considered in this paper. For example, 500 AGNs with σ_tot = 0.26, in the best scenario of P_mix = 1, lead to an average evidence of 5.7σ, while 1000 objects yield an average evidence of 7.7σ.

Comparison with the measured signal 

The measured r-value for the BDS-203 catalog is much larger than what would typically occur when applying the r-test to a scatter sample distributed according to the ALP-mixing model. As can be seen in Figure 8, regardless of the considered P_mix, we find the probability of the observed signal to be smaller than 0.4%. We obtain a similar result for the BDS-77 catalog.

FIG. 8 - r-distribution for the BDS-203 catalog at E = 2 keV when the scatter is distributed according to the ALP-mixing model. The measured signal, r, is represented by the vertical black line, while its statistical significance is quoted in the legend as p(r).

A similar situation is found for the Full catalog, where the measured signal is systematically higher than the expected one (see Figure 9). The probability of such measurements reaches the 7% level for P_mix = 1.

FIG. 9 - r-distribution for the Full catalog at E = 2 keV when the scatter is distributed according to the ALP-mixing model.

The BDS-77, BDS-203 and Full catalogs all suggest that something other than photon-ALP mixing is affecting the X-ray-to-optical luminosity ratio in a way that mimics the mixing effect. As we already pointed out in Sec. IV A, X-ray absorption could be the cause.

The High-Γ catalog, due to its low scatter variance, can potentially provide a statistically significant detection of photon-ALP mixing. This is evident from Figure 10, where the average outcome of the r-test can be as high as ⟨r⟩ = 20. However, the measured signal never shows a significant preference for the ALP-mixing model. In fact, it favors the Gaussian model in most of the cases (P_mix = 1, 0.7, 0.4).

FIG. 10 - r-distribution for the High-Γ catalog at E = 2 keV when the scatter is distributed according to the ALP-mixing model.



B. Gaussian model results

The measured r-values are compatible with what is expected from a Gaussian scatter only in the following cases:

1. for the High-Γ catalog, when E > 2 keV;

2. for the Full catalog, when E > 4 keV.

This means that the Gaussian model performs well according to the r-test where we expect X-ray absorption to be negligible. An example of this is shown in Figure 11, where we show the r-distribution for the High-Γ catalog for 4 different frequencies. For frequencies higher than 2 keV, the measured signal lies comfortably within the expected distribution of r-values.



FIG. 11 - r-distribution for the High-Γ catalog at P_mix = 1 for 4 different frequencies when the scatter is distributed according to the Gaussian model.



C. Shot-noise model results

The introduction of shot noise on top of a Gaussian-distributed scatter can significantly alter the outcome of the r-test if fewer than ⟨N⟩ = 20 X-ray photons are collected. The impact on the r-distribution is a shift of its peak towards higher r-values and a broadening of its width. Moreover, when the number of collected photons is smaller than ~10, shot noise can even trick the r-test into showing a preference for the ALP model (i.e. a positive r-value). For example, for the High-Γ catalog and when ⟨N⟩ = 5, the r-test yields an average value of 12, that is a ~3.5σ preference for the ALP-mixing model. All of this is clear from Figure 12, where we show the r-distributions for the High-Γ catalog for 4 different models: the ALP-mixing model, the Gaussian model, the shot-noise model with ⟨N⟩ = 5 and the shot-noise model with ⟨N⟩ = 10.
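
One way to see why low photon counts can mimic a tail towards low luminosities is to look at the skewness of the logarithm of Poisson-distributed counts. The sketch below is only a toy illustration under that assumption; it does not reproduce the shot-noise model of Sec. II, and the function names and sample sizes are hypothetical.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def log_count_skewness(mean_counts, n_sources=5000, seed=0):
    """Skewness of log10(counts / mean) over simulated detected sources."""
    rng = random.Random(seed)
    logs = []
    while len(logs) < n_sources:
        n = poisson(rng, mean_counts)
        if n > 0:  # sources with zero counts are treated as undetected
            logs.append(math.log10(n / mean_counts))
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / len(logs)
    third = sum((x - mu) ** 3 for x in logs) / len(logs)
    return third / var ** 1.5

skew_5 = log_count_skewness(5)
skew_20 = log_count_skewness(20)
print(f"<N> = 5: skewness {skew_5:.2f};  <N> = 20: skewness {skew_20:.2f}")
```

In this toy setting the log-count distribution is skewed towards low values, and more strongly so for smaller mean counts, which is the same direction as the skewness expected from mixing.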



[Figure 12 legend: Gaussian model, ⟨r⟩ = -8.2; Shot-10 model, ⟨r⟩ = -2.6; Shot-5 model, ⟨r⟩ = 12; ALP model, ⟨r⟩ = 20.]

FIG. 12 - r-distribution for the High-Γ catalog at E = 2 keV and P_mix = 1 for 4 different models. The shot-noise models stay in between the Gaussian and ALP-mixing models.

We do not expect the Full and High-Γ catalogs to be affected by shot noise since all their AGNs are detected with far more than 20 X-ray counts. However, a non-negligible number of AGNs in the BDS-77 and BDS-203 catalogs (around 10% of the total, see Sec. IV A) have photon counts below 20. Therefore, shot noise may contribute to the high r-values measured for those catalogs. Nevertheless, the impact of shot noise is likely to be much smaller than that due to X-ray absorption.



D. Goodness-of-fit tests 

We tested the measured scatter against the ALP-mixing, Gaussian and shot-noise models by making use of the Kolmogorov-Smirnov, Kuiper and Anderson-Darling tests. The Kolmogorov-Smirnov and Kuiper tests tend to be more sensitive to the center of the distribution, while the Anderson-Darling test is weighted to emphasize the tails. For every dataset, X-ray frequency and scatter model, we calibrated the goodness-of-fit tests by applying them to 50,000 simulated scatter samples. Table IV shows the statistical significances obtained by comparing the measured statistics to the simulated distributions. Values lower than 0.05 imply a rejection of the model at the 2σ level.
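
The calibration step can be sketched for the KS and Kuiper statistics against a zero-mean Gaussian model. This is a minimal illustration under stated assumptions: the significance is taken to be the fraction of simulated samples whose statistic is at least as large as the measured one, the Anderson-Darling test is omitted for brevity, and far fewer than 50,000 simulations are used. All names are hypothetical.

```python
import math
import random

def gaussian_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def ks_and_kuiper(sample, cdf):
    """Return the KS statistic D and the Kuiper statistic V = D+ + D-."""
    xs = sorted(sample)
    n = len(xs)
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus), d_plus + d_minus

def calibrated_significance(sample, sigma, n_sim=500, seed=0):
    """Fraction of simulated samples with a statistic >= the measured one."""
    rng = random.Random(seed)
    cdf = lambda x: gaussian_cdf(x, sigma)
    ks_obs, kuiper_obs = ks_and_kuiper(sample, cdf)
    ks_hits = kuiper_hits = 0
    for _ in range(n_sim):
        sim = [rng.gauss(0.0, sigma) for _ in range(len(sample))]
        ks_sim, kuiper_sim = ks_and_kuiper(sim, cdf)
        ks_hits += ks_sim >= ks_obs
        kuiper_hits += kuiper_sim >= kuiper_obs
    return ks_hits / n_sim, kuiper_hits / n_sim

# Toy data: one sample drawn from the model, and a shifted copy of it.
toy_rng = random.Random(42)
toy = [toy_rng.gauss(0.0, 0.3) for _ in range(200)]
p_ks, p_kuiper = calibrated_significance(toy, 0.3)
p_ks_shifted, _ = calibrated_significance([x + 0.3 for x in toy], 0.3)
```

With this convention a value below 0.05 would reject the model at the 2σ level, as for the shifted toy sample.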

The most obvious result of the goodness-of-fit tests is that none of the distributions does particularly well in reproducing the data. Regardless of the analyzed model or dataset, most of the measured signals have a statistical significance smaller than 10%. Every test fails at the 1σ level, and only one data set is accepted in all three tests at the 2σ level (the BDS-77 set for the ALP-mixing model).

The r-test is most sensitive to the tails of the distribu- 
tions, so we expect the AD test to be the best predictor 
of the r value. Indeed, this is the case: where the r value 
is high, the AD test gives a very low value for the Gaus- 
sian distribution, and higher values for the ALP-mixing 
model. However, though the values are higher for the 
ALP-mixing model, they remain far lower than would 
typically be expected. This is consistent with what can 
be seen from the distributions in Figs. 4 and 7. (Note 
that the figures show only the left tail.) 

In the soft X-ray band, most of the tests yield very low significances, particularly in the Full data set (the Full catalog is never fit by any model for E < 2 keV). This suggests that none of the distributions reflects the data well and that something, such as X-ray absorption, is missing in the models. The agreement generally improves when going to the harder X-ray bands or to the better cleaned High-Γ data set. This is not surprising, as the scatter in these cases tends to fall in between the ALP-mixing and Gaussian distributions.

These tests generally confirm what we have seen above: 
preference for the ALP-mixing model, as reflected in the 
r statistic, should not be confused with the data being 
consistent with this model. Indeed, there are strong indi- 
cations that the simple models we consider fail to explain 
the data, particularly in the soft X-ray band. 

Model   Catalog   Test     1 keV    1.5 keV   2 keV    4 keV    7 keV    10 keV
ALP     High-Γ    KS       0.17     0.13      0.061    0.078    0.061    0.033
                  Kuiper   0.15     0.3       0.22     0.028    0.18     0.023
                  AD       0.0021   6.2e-4    4.2e-4   6.6e-4   0.0051   0.017
        Full      KS       …        …         0.028    0.082    0.024    0.036
                  Kuiper   …        …         0.0042   0.22     0.067    0.046
                  AD       …        …         0.014    0.0026   0.008    0.026
        BDS-77    KS       -        -         0.27     -        -        -
                  Kuiper   -        -         0.11     -        -        -
                  AD       -        -         0.067    -        -        -
        BDS-203   KS       -        -         0.056    -        -        -
                  Kuiper   -        -         0.008    -        -        -
                  AD       -        -         0.067    -        -        -
Gauss   High-Γ    KS       0.005    0.035     0.005    0.043    0.089    0.051
                  Kuiper   2.2e-4   0.004     7e-4     0.011    0.03     0.0039
                  AD       0.015    0.017     0.011    0.014    0.035    0.071
        Full      KS       …        …         4e-5     0.06     0.039    0.073
                  Kuiper   …        …         …        0.0068   0.0053   0.0083
                  AD       …        …         0.0031   0.031    0.046    0.099
        BDS-77    KS       -        -         0.062    -        -        -
                  Kuiper   -        -         0.029    -        -        -
                  AD       -        -         1.6e-4   -        -        -
        BDS-203   KS       -        -         0.012    -        -        -
                  Kuiper   -        -         0.0035   -        -        -
                  AD       -        -         3.2e-4   -        -        -
Shot    High-Γ    KS       0.064    0.026     0.01     0.019    0.024    0.014
                  Kuiper   0.039    0.042     0.041    0.11     0.057    0.0054
                  AD       0.0016   0.001     9e-4     7.8e-4   0.0047   0.013
        Full      KS       …        …         0.0084   0.026    0.0087   0.018
                  Kuiper   …        …         7.6e-4   0.12     0.016    0.014
                  AD       …        …         0.011    0.0026   0.0059   0.021
        BDS-77    KS       -        -         0.19     -        -        -
                  Kuiper   -        -         0.062    -        -        -
                  AD       -        -         0.0019   -        -        -
        BDS-203   KS       -        -         0.032    -        -        -
                  Kuiper   -        -         0.0023   -        -        -
                  AD       -        -         0.0021   -        -        -

Entries marked … are missing or cannot be assigned to a cell. The surviving 1 keV and 1.5 keV values for the Full catalog are, in order of appearance: ALP 0.03, 2e-4, 8e-5, 0.064; Gauss 3.6e-4, 0.001; Shot 0.0019, 8e-5, 0.0044.

TABLE IV - Statistical significance of the KS, Kuiper and AD statistics obtained for the various catalogs. For the ALP-mixing model we assumed p_0 = 0.1 and P_mix = 1; for the Shot-noise model we used ⟨N⟩ = 5.



VI. CONCLUSIONS 

The ratios of luminosities of high redshift objects, introduced by Burrage et al. [1], offer a new avenue to explore the possibility of strong mixing between photons and axions, which in principle can have significant power to constrain these models. However, much effort must be taken to ensure a homogeneous sample of objects, so that any scatter is due to the coupling to axions.

Here, we have reproduced the analysis of Burrage et al., studying the scatter of the empirical relation between the optical/UV and X-ray monochromatic luminosities of AGNs. In addition to the AGN catalogs already analysed by them (BDS-77 and BDS-203), we considered two samples from the SDSS/XMM-Newton Quasar Survey [2]: Full-340 and High-Γ-320. These data sets have multi-wavelength information which has allowed a more thorough investigation of the model constraints.

We ran statistical analyses on 18 combinations of 
optical/UV and X-ray monochromatic luminosities and 
found no compelling evidence for the presence of ALPs. 
Whenever we found a signal compatible with the presence 
of ALPs, it was coming either from the soft X-ray band, 
where absorption is more likely to happen, or from the 
Full catalog, which is thought to include more absorbed 
AGNs. Moreover, the ALP-mixing model fails to repro- 
duce the features of the scatter even where it is preferred 
over its competing model, the simple Gaussian model. 

By means of simulations, we calculated the distribution of the r-statistic used by Burrage et al. to discriminate between the ALP-mixing model and the Gaussian model. We found that actual data yields values of the r-statistic which are either significantly higher (BDS-203 and Full catalogs) or lower (High-Γ catalog) than what we would typically expect if photon-ALP mixing were taking place. Stated differently, we see ALP evidence where there should not be sensitivity, and we do not see evidence in the majority of data sets which should be most sensitive.

This behaviour suggests that another source of scatter, such as X-ray absorption, is present and casts doubt on the suitability of the r-statistic alone to estimate photon-ALP mixing. This is supported by a detailed examination of the low luminosity sources which dominate the statistical tests. We found that the BDS-77 and BDS-203 catalogs respectively contain one and two AGNs which are known to be X-ray-weak independently of photon-ALP mixing. Upon removing these sources, the value of the r-statistic drops significantly in both cases, as does the preference for the ALP-mixing model over the Gaussian model.

Thus, while this new method for observing photon-ALP mixing is in principle very powerful, considerable care must be taken. Given the many ways that scatter can be introduced into the relations between the low and high energy luminosities of objects, evidence for mixing must be closely examined to check that it fits the expected signal. In particular, the scatter should follow the expected PDF and be independent of the energy of the high energy photons. Ideally, it should also be observed in multiple classes of objects, where the intrinsic scatter has different physical origins.

This also highlights the importance of finding classes of 
objects where the intrinsic scatter is small or well under- 
stood. Since the scatter from mixing is well understood 
and fixed, if any sample is observed with less scatter than 
expected from mixing, then mixing can be ruled out, or 
at the very least, the probability of mixing, Pmix, can be 
constrained. 